

Section: New Results

Communities and Social Interactions Analysis

Ontologies-Based Platform for Sociocultural Knowledge Management

Participants : Papa Fary Diallo, Olivier Corby, Isabelle Mirbel.

This work is done in the PhD thesis of P. F. Diallo. We designed a sociocultural platform aiming at preserving and capitalizing on sociocultural events in Senegal. This platform relies on Semantic Web technologies. We provided two ontologies to support our platform: an upper-level sociocultural ontology (USCO) and a human time ontology (HuTO). To build our upper-level ontology we proposed a methodology based on the theory of the Russian psychologist Lev Vygotsky, called the "Vygotskian Framework". We designed the Human Time Ontology (HuTO, http://ns.inria.fr/huto), whose major contributions are (i) the modeling of non-convex (repetitive) intervals such as every Monday, (ii) the representation of deictic temporal expressions (e.g. today), which are defined relative to the time of speech, and (iii) qualitative temporal notions, i.e. temporal notions relative to a culture or a geographical position. The platform allows Senegalese communities to share and co-construct their sociocultural knowledge. This work was published in the Journal of Data Semantics [14].
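To make the non-convex interval modeling concrete, the following is a minimal sketch, in Python with rdflib, of how a repetitive interval such as "every Monday" could be asserted; the class and property names (RepetitiveInterval, dayOfWeek, hasTime) are placeholders for illustration, not the actual HuTO terms published at http://ns.inria.fr/huto.

    from rdflib import Graph, Namespace, Literal
    from rdflib.namespace import RDF

    # Placeholder namespace and terms; the actual vocabulary is defined at http://ns.inria.fr/huto
    HUTO = Namespace("http://ns.inria.fr/huto/")
    EX = Namespace("http://example.org/")

    g = Graph()
    g.bind("huto", HUTO)

    # A non-convex (repetitive) interval: "every Monday"
    every_monday = EX["everyMonday"]
    g.add((every_monday, RDF.type, HUTO["RepetitiveInterval"]))  # assumed class name
    g.add((every_monday, HUTO["dayOfWeek"], Literal("Monday")))  # assumed property name

    # A recurring sociocultural event attached to that interval
    event = EX["weeklyMarket"]
    g.add((event, HUTO["hasTime"], every_monday))                # assumed property name

    print(g.serialize(format="turtle"))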

SMILK - Social Media Intelligence and Linked Knowledge

Participants : Farhad Nooralahzadeh, Elena Cabrio, Molka Dhouib, Fabien Gandon.

Natural Language Processing (NLP), Linked Open Data and social networks are the three topics of the SMILK ANR LabCom, together with their coupling, studied in three ways: texts and Linked Data, Linked Data and social resources, texts and social resources. SMILK is a joint laboratory between Inria and the VISEO company whose aim is to develop research and technologies to retrieve, analyze and reason about data linked to textual Web resources, and to exploit open Web data while taking into account social structures and interactions, in order to improve the analysis and understanding of textual resources.

In this context, we have developed an entity discovery tool based on semantic spreading activation, and we integrated it into the SMILK framework. The goal of such a tool is to semantically enrich the data by linking the mentions of named entities in the text to the corresponding known entities in knowledge bases. Our approach considers multiple aspects: the prior knowledge about an entity in Wikipedia (i.e. the keyphraseness and commonness features, which can be precomputed by crawling the Wikipedia dump), a set of features extracted from the input text and from the knowledge base, and the correlation/relevancy among the resources in Linked Data. More precisely, this work explores a collective ranking approach formalized as a weighted graph model, in which the mentions in the input text and the candidate entities from knowledge bases are linked using local compatibility and global relatedness. Experiments on the datasets of the Open Knowledge Extraction (OKE) challenge (https://github.com/anuzzolese/oke-challenge), with different configurations of our approach in each phase of the linking pipeline, reveal its optimal configuration. We investigated a notion of semantic relatedness between two entities represented as sets of neighbors in Linked Open Data, which relies on an associative retrieval algorithm taking the common neighborhood into account. This measure improves the performance of prior link-based models and outperforms the explicit inter-link relevancy measure among entities (mostly Wikipedia-centric). Thus, our approach is resilient to non-existent or sparse links among related entities.
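As an illustration only (a simplified stand-in for the associative retrieval measure, not the SMILK implementation), the relatedness between two candidate entities can be sketched in Python as an overlap between their Linked Data neighborhoods:

    def relatedness(neighbors_a, neighbors_b):
        """Neighborhood-overlap relatedness between two entities, each
        represented as the set of resources it is linked to in the
        Linked Data graph (Jaccard-style simplification)."""
        if not neighbors_a or not neighbors_b:
            return 0.0
        return len(neighbors_a & neighbors_b) / len(neighbors_a | neighbors_b)

    # Toy example with hand-picked DBpedia-style neighborhoods
    paris_neighbors = {"dbr:France", "dbr:Seine", "dbr:Ile-de-France"}
    louvre_neighbors = {"dbr:France", "dbr:Paris", "dbr:Seine"}
    print(relatedness(paris_neighbors, louvre_neighbors))  # 0.5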

In parallel, we proposed an approach to automatically annotate texts in the cosmetics field with the ProVoc and GoodRelations RDF vocabularies, resulting in a knowledge base in a Semantic Web format that can be used in various applications. Building on the entity linking tool described above (which links named entities in a text to entities in the LOD), we focused on the extraction of relations between these entities in French texts. In the extraction process, particular attention is given to the contribution of syntactic rules, in order to improve accuracy with respect to existing systems.
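As a hedged sketch of the kind of RDF such an annotation could produce (the GoodRelations namespace below is the published one; the ProVoc namespace and the hasProduct property are assumptions made here for illustration):

    from rdflib import Graph, Namespace, Literal
    from rdflib.namespace import RDF, RDFS

    GR = Namespace("http://purl.org/goodrelations/v1#")  # GoodRelations
    PV = Namespace("http://ns.inria.fr/provoc#")         # assumed ProVoc namespace
    EX = Namespace("http://example.org/smilk/")

    g = Graph()
    g.bind("gr", GR)
    g.bind("pv", PV)

    brand = EX["AcmeCosmetics"]   # hypothetical brand mentioned in a text
    product = EX["AcmeLipBalm"]   # hypothetical product mentioned in the same text

    g.add((brand, RDF.type, GR["BusinessEntity"]))
    g.add((product, RDF.type, GR["ProductOrServiceModel"]))
    g.add((product, RDFS.label, Literal("Acme lip balm", lang="en")))
    g.add((brand, PV["hasProduct"], product))  # extracted relation (assumed property name)

    print(g.serialize(format="turtle"))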

Community Detection and Interest Labeling

Participants : Zide Meng, Fabien Gandon, Catherine Faron-Zucker.

Temporal Analysis of User and Topic

Building on previous work on overlapping community detection in question-answering sites, we proposed an approach to jointly model topics, expertise, activity and trends, which allowed us to retrieve meaningful latent information from user-generated content. We proposed a method to track the dynamics of topics and users; it can track these dynamics at a chosen time granularity, such as yearly, monthly, daily or hourly. Moreover, the model overcomes a comparison problem of LDA (Latent Dirichlet Allocation)-based models by modeling the reverse distribution. This work has been published at IEEE/WIC/ACM Web Intelligence [62].
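The following small Python sketch (not the model published in [62]) only illustrates the granularity aspect: user activity is bucketed per year, month, day or hour before per-period topic distributions are compared.

    from collections import Counter, defaultdict
    from datetime import datetime

    def bucket_topics(posts, granularity="monthly"):
        """Group (timestamp, topic) pairs by time bucket and count topic
        occurrences per bucket, so topic dynamics can be inspected at the
        chosen granularity."""
        fmt = {"yearly": "%Y", "monthly": "%Y-%m",
               "daily": "%Y-%m-%d", "hourly": "%Y-%m-%d %H"}[granularity]
        counts = defaultdict(Counter)
        for timestamp, topic in posts:
            counts[timestamp.strftime(fmt)][topic] += 1
        return counts

    posts = [(datetime(2016, 5, 2, 14), "semantic-web"),
             (datetime(2016, 5, 9, 10), "semantic-web"),
             (datetime(2016, 6, 1, 9), "nlp")]
    print(bucket_topics(posts, "monthly"))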

Topic labeling

The output of a topic model is normally a set of bags of words, where each topic consists of closely related words. An interesting question is how to assign one or more labels to such a set in order to indicate the general meaning of the bag of words. By integrating the original dataset with linked open data sources, we are now planning to propose a generic method to automatically label the detected topics.

Default Knowledge based on the Analysis of Natural Language

Participants : Elena Cabrio, Valerio Basile, Fabien Gandon.

In the context of the ALOOF project, we developed new methods to build repositories of default knowledge based on the analysis of natural language. The first efforts are aimed at extracting information about common objects, in particular their location and their typical usage [24].

One of the methods to extract general knowledge from text is implemented in the KNEWS pipeline, of which a demo was presented at ECAI [25]. At the same conference, we also presented the results of another system that helps robots identify unknown objects based on their proximity to known objects observed in the scene [52]. KNEWS was also used to automatically build a large collection of text aligned with an RDF representation of its meaning. The first envisioned application of such a resource is to provide a basis for robust natural language generation from RDF triples using statistical methods [22].

We also explored the application of distributional semantics to the general knowledge extraction problem. We computed vector-based models of objects and used supervised statistical models to predict their typical locations (e.g. knife-kitchen, printer-office) [27]. Once our models had been successfully tested experimentally against a gold standard of human judgments, we built a large, freely available knowledge base of object locations (https://project.inria.fr/aloof/data/).
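As a hedged illustration of the underlying idea (hand-made toy vectors, not the embeddings or the supervised models used in [27]), a typical location can be predicted by comparing an object vector with candidate location vectors:

    import numpy as np

    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    # Toy vectors standing in for distributional word embeddings
    vectors = {
        "knife":   np.array([0.9, 0.1, 0.0]),
        "printer": np.array([0.1, 0.9, 0.2]),
        "kitchen": np.array([0.8, 0.2, 0.1]),
        "office":  np.array([0.2, 0.8, 0.3]),
    }
    locations = ["kitchen", "office"]

    def typical_location(obj):
        """Rank candidate locations by similarity to the object vector."""
        return max(locations, key=lambda loc: cosine(vectors[obj], vectors[loc]))

    print(typical_location("knife"))    # -> kitchen
    print(typical_location("printer"))  # -> office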

Semantic Modeling of Social, Spatiotemporal and Dedicated Networks

Participants : Amel Ben Othmane, Nhan Le Thanh, Andrea Tettamanzi, Serena Villata.

During the academic year 2015/2016, we worked in part on validating the model we proposed in [72]. An extended version of this paper, entitled An Agent-based Architecture for Personalized Recommendations, will be published in January 2017 in Springer's LNCS series. To this end, we proposed in [29] a multi-agent simulation in the NetLogo environment to illustrate the usefulness and feasibility of the proposed framework in a realistic scenario, and we evaluated the performance of the agents' behaviors under two different strategies: acting within communities of agents sharing similar interests, and acting as solitary agents.

Results show that agents achieve better performance collectively when they are in “communities”, i.e. groups of agents with shared interests (thus similar to each other), than when they act as solitary agents. We believe that the issues of trust and recommendation are tightly related; for that reason, we analyzed the behavior of social agents with and without a trust model. Results show that exchanging beliefs or desires with trustworthy agents can improve the overall performance of the agents.

We have also been working on extending the proposed model with spatial and temporal reasoning. A spatio-temporal belief or desire is considered as an event, defined as a spatial relation holding over a temporal interval. To reason with such information, we propose to combine the Region Connection Calculus (RCC-8) formalism with Allen’s interval algebra. Spatio-temporal data is often affected by imprecision and vagueness; to tackle this problem, we believe that fuzzy sets, thanks to their ability to represent degrees of membership, are better suited to modeling spatio-temporal data. We therefore propose fuzzy versions of RCC-8 and of Allen’s interval algebra, and combine them in order to represent and reason about imprecise spatio-temporal beliefs and desires. We also worked on validating this approach in a real-world scenario.
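To make the combination concrete, here is a minimal Python sketch (with an assumed linear membership function, not the actual formalization) of a fuzzy version of Allen’s "before" relation, where the degree decreases as the gap between two intervals shrinks:

    def fuzzy_before(interval_a, interval_b, tolerance=2.0):
        """Degree to which interval_a is 'before' interval_b.
        Returns 1.0 when interval_a ends well before interval_b starts,
        0.0 when it does not end before interval_b starts at all, and a
        linear degree in between. Intervals are (start, end) pairs on a
        numeric time axis; the tolerance is an assumed parameter."""
        gap = interval_b[0] - interval_a[1]
        if gap >= tolerance:
            return 1.0
        if gap <= 0:
            return 0.0
        return gap / tolerance

    # A belief held "roughly before" an observed event
    print(fuzzy_before((1, 3), (6, 8)))  # 1.0: clearly before
    print(fuzzy_before((1, 5), (6, 8)))  # 0.5: borderline
    print(fuzzy_before((1, 7), (6, 8)))  # 0.0: overlapping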